Using the Right Tools: Enhancing Retrieval from Marked-up Documents

Authors

  • Christopher A. Welty
  • Nancy Ide
Abstract

We are experimenting with the representation of a DTD and associated documents (i.e., documents conformant to the DTD) in a knowledge representation (KR) system, in order to provide more sophisticated query and retrieval from TEI documents than current systems provide. We are using CLASSIC, a frame-based representation system developed at AT&T Bell Laboratories. Like many KR systems, CLASSIC enables the definition of structured concepts/frames, their organization into taxonomies, the creation and manipulation of individual instances of such concepts, and inference such as inheritance, relation transitivity, inverses, etc. In addition, CLASSIC provides for the key inferences of subsumption and classification. By representing a document as an individual instance of a hierarchy of concepts derived from the DTD, and by allowing the creation of additional user-defined concepts and relations, sophisticated query and retrieval operations can be performed. This paper describes CLASSIC and the formalism of description logic that underlies it, and demonstrates how it can be used for enhanced retrieval from richly encoded documents.
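As a rough illustration of the subsumption and classification inferences mentioned in the abstract, the following minimal Python sketch models concepts as an element tag plus a set of required child tags. It is not CLASSIC's language or API, and it is not the authors' implementation. The names `Concept`, `Element`, `subsumes`, and `classify`, the example concepts, and the reduction of a description to "tag plus required children" are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """Toy concept (hypothetical): an element tag plus child tags it must contain."""
    name: str
    tag: str
    requires: frozenset = frozenset()


@dataclass
class Element:
    """Toy document element (hypothetical): a tag and a list of child elements."""
    tag: str
    children: list = field(default_factory=list)


def subsumes(general, specific):
    """True if every instance of `specific` is necessarily an instance of `general`."""
    return general.tag == specific.tag and general.requires <= specific.requires


def classify(element, concepts):
    """Return the names of the most specific concepts the element satisfies."""
    child_tags = {child.tag for child in element.children}
    matches = [
        c for c in concepts
        if c.tag == element.tag and c.requires <= child_tags
    ]
    # Drop any match that strictly subsumes another match (it is less specific).
    return [
        c.name for c in matches
        if not any(
            o is not c and subsumes(c, o) and not subsumes(o, c)
            for o in matches
        )
    ]


# Concepts loosely modelled on DTD element types, plus user-defined refinements.
division = Concept("Division", "div")
quoting = Concept("QuotingDivision", "div", frozenset({"quote"}))
annotated = Concept("AnnotatedQuotingDivision", "div", frozenset({"quote", "note"}))
concepts = [division, quoting, annotated]

# A document fragment: a division containing a heading and a quotation.
fragment = Element("div", [Element("head"), Element("quote")])

print(subsumes(division, annotated))  # the general concept subsumes the refinement
print(classify(fragment, concepts))   # most specific concepts for this fragment
```

In this sketch a query is just another concept: retrieving every element classified under `QuotingDivision` also returns elements that satisfy the more specific `AnnotatedQuotingDivision`, because the former subsumes the latter. A real description logic system supports far richer descriptions than this tag-and-children approximation.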




Journal:
  • Computers and the Humanities

Volume 33

Publication date: 1999